Analysing, visualising and sharing data using QGIS on the VDI

In this notebook we use QGIS as a tool for interacting with data using NCI's web data services, and on the VDI filesystem.

We start by demonstrating one way of getting data into QGIS, and end with a shareable interactive 3D visualisation.

The program for this session is:

  1. Define a research question
  2. Determine which datasets we can use to investigate our question
  3. Obtain the data, and information about the data
  4. Load QGIS (Quantum GIS, http://qgis.org) on the VDI
  5. Find web coverage raster data and import into QGIS
  6. Styling raster data for visualisation
  7. Find cadastral data and import it into QGIS
  8. Install two useful QGIS plugins
  9. Analysis using zonal statistics
  10. Using a 3D visualisation to complete the picture
  11. Interpreting our results
  12. Share our new map

What is QGIS?

QGIS (Quantum GIS) is an application for geospatial data analysis and visualisation, focussed on 2D (raster and vector) datasets, and wrapped in a straightforward graphical user interface. It can also be used as a Python library, but for this session we use the GUI.

I have [insert many other tools] on the VDI - why use QGIS?

QGIS can be used on the VDI as a rapid-prototyping tool. If you've got all the infrastructure you need available in your project space and don't need it, that's great!

We hope to demonstrate some ways which QGIS can help you quickly develop shareable, multi-source analyses quickly and simply. We also revise some web service principles.

1. Define a question

For this session the question we want to address is:

How are the 'hilliness' of suburban areas and and 'vegetation levels' in Canberra related?; and are hiller regions more or less vegetated than flatter regions?

2. Decide on some useful datasets

To answer our question we need some data on topography, vegetation and where block boundaries are.

From NCI's collections we can use:

  • A Shuttle Radar Topography Mission raster DEM at 1 arc-second resolution (link here)
  • Some vegetation indices from the ANU Water and Landscape Dynamics collection - how about photosynthetic vegetation cover. (link here)

NCI doesn't hold cadastral data, so we'll get those from the local land administrator:

  • ACT cadastral data outlining sections (held outside NCI)

Note - here, sections are collections of 50 or so individual house blocks. Our elevation and tree cover data are too coarse for individual house blocks - and the vector data file for blocks is massive!

3. Take a look at the data held at NCI

Jingbo's catalogue stuff...

4. Start QGIS on the VDI

Like most VDI applications, we use the module load system:

module purge
module load qgis/2.2.0 gdal/1.9.2
qgis &

This will start QGIS - which we'll use shortly. For now, head to Firefox - we are off to discover some data!

5. Get topography data using Web Coverage Services

For topography we need a terrain model in a raster format, e.g. GeoTIFF, which covers Canberra.

Shuttle Radar Topography Mission (SRTM) data are a good source of reasonably detailed elevation data held at NCI as NetCDF tiles.

We can head to the NCI Elevation collection here:

http://dapds00.nci.org.au/thredds/catalog/rr1/Elevation/catalog.html

...and look for SRTM 1 second elevation as NetCDF. Browse to the NetCDF folder, click through to:

http://dapds00.nci.org.au/thredds/catalog/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/catalog.html

A bunch of 1 degree tiles are shown here - but we will collect a region we want from the national mosaic:

http://dapds00.nci.org.au/thredds/catalog/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/catalog.html?dataset=rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/Elevation_1secSRTM_DEMs_v1.0_DEM_Mosaic_dem1sv1_0.nc

Click on the 'WCS' link to see the WCS getCapabilities statement - which describes the data you can obtain here. We need to know the name for the coverage we need - look for the 'name' tag. With that,and using our WCS knowledge, we can request just the part of the data we need and acquire a GeoTIFF image. Let's break a WCS request down into parts we need:

  • the path to the data: http://dapds00.nci.org.au/thredds/wcs/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/
  • the dataset: Elevation_1secSRTM_DEMs_v1.0_DEM_Mosaic_dem1sv1_0.nc
  • the service: service=WCS
  • the service version: version=1.0.0
  • the thing we want to do (get a coverage): request=GetCoverage
  • the coverage (or layer) we want to get: Coverage=elevation
  • the boundary of the layer we want: bbox=148.5.0,-35.5,149.5,-34.5
  • the format we want to get our coverage as: format=GeoTIFF_Float

To build a WCS request we concatenate the data path and dataset name, put a question mark after the dataset name, then add the rest of the labels describing the thing we want afterward, in any order, separated by ampersands:

http://dapds00.nci.org.au/thredds/wcs/rr1/Elevation/NetCDF/1secSRTM_DEMs_v1.0/DEM/Elevation_1secSRTM_DEMs_v1.0_DEM_Mosaic_dem1sv1_0.nc?service=WCS&version=1.0.0&request=GetCoverage&Coverage=elevation&bbox=148.5.0,-35.5,149.5,-34.5&format=GeoTIFF_Float

Enter this link into a web browser to obtain a GeoTIFF DEM in your default download location (check in 'downloads'). Rename it to something memorable if you wish.

Note the use of GeoTIFF_Float - using only GeoTIFF is possible, but gives you an image with pixel values scaled to a colour range,m not a data range

Now go to QGIS and import the GeoTIFF as a raster layer:

b. Vegetation data

We repeat this process with data on photosynthetically active vegetation, which comes from the ANU Water and Landscape Dynamics collection:

http://dapds00.nci.org.au/thredds/catalog/ub8/au/FractCov/PV/catalog.html

We'll look at 2015:

http://dapds00.nci.org.au/thredds/catalog/ub8/au/FractCov/PV/catalog.html?dataset=ub8-au/FractCov/PV/FractCover.V3_0_1.2015.aust.005.PV.nc

Using WCS again - click on the WCS link, and look for a tag - it says 'PV'. This is the coverage we need to get. So we form a WCS request like this:

  • the path to the data: http://dapds00.nci.org.au/thredds/wcs/ub8/au/FractCov/PV/
  • the dataset: FractCover.V3_0_1.2015.aust.005.PV.nc
  • the service: service=WCS
  • the service version: version=1.0.0
  • the thing we want to do (get a coverage): request=GetCoverage
  • the coverage (or layer) we want to get: Coverage=TreeCover
  • the boundary of the layer we want: bbox=149.0,-36,149.9,-35
  • the format we want to get our coverage as: format=GeoTIFF_Float

...which are assembled into a WCS request:

http://dapds00.nci.org.au/thredds/wcs/ub8/au/FractCov/PV/FractCover.V3_0_1.2015.aust.005.PV.nc?service=WCS&version=1.0.0&request=GetCoverage&Coverage=PV&bbox=149.0,-36,149.9,-35&format=GeoTIFF_Float

Again, using this link in a web browser will result in a GeoTIFF file arriving at your default download location. Load it into QGIS:

5. 6. Style our vegetation layer

So far we have a bunch of grey things - let's make them colourful!

We don't need to worry about the elevation data, but we do need a useful colour scheme for vegetation data and our cadastral data. Double click your vegetation layer (in the layers panel) to bring up it's properties dialogue. From there:

  1. Click 'style'
  2. In the 'render type' dropdown, choose 'Singleband pseudocolor'
  3. Leave interpolation as 'linear', and pick a colour palette. Because the data are continuous, a single hue continuous colour palette makes sense. Because it's vegetation, green also makes sense. Choose what makes sense to you.
  4. Click 'classify' to show the palette in the big window, and apply with with 'apply'.
  5. Click 'close' to return to your map.

(image above missing 'classify' step)

Extension:

Steaming ahead? Add a graticule to your QGIS map layer...

Challenge question:

How could you automagically save WCS data with sane, memorable names?

Q & A

Why not just use the QGIS OWS browser to grab these data?

  1. The QGIS WCS browser and THREDDS don't play well - QGIS attenuates the full THREDDS WCS URL, so it is far more convenient to form a request independently of QGIS.
  2. OWS layers can't be used for processing - ie band math, and the visualisation tools we're about to use.

If you're keen, do some exploring - we don't know *everything* about QGIS - surprise us!

Why not get these data from the filesystem?

Great question! The main reason is that we're demonstrating QGIS as a way to fuse data from manifold sources. Over today and tomorrow you'll see a log examples of how to collect data from the filesystem in manageable chunks - which you could then analyse/visualise using the methods shown here. Demonstrating web coverage services on the VDI shows how you can pull data from many external sources to help interpret your work.

7. Find cadastral data and import it into QGIS

ACT publish a lot of spatial data on their ACTMAPi interface. To shortcut, here are the cadastral section data:

http://actmapi.actgov.opendata.arcgis.com/datasets/eedcda7873934e789093b093521b0299_3

You can download a shapefile and import it to QGIS, but let's show a feature you might like to use: adding vector data as a service. At ACTMAPi, click the 'API' menu box, and copy the URL out of the GeoJSON tetxtbox:

http://actmapi.actgov.opendata.arcgis.com/datasets/eedcda7873934e789093b093521b0299_3.geojson

In QGIS, Click 'add new vector layer', select the 'protocol' radio button, and past the URL in. The layer will load - but like raster data, it can't be used for analysis. Save it as a Geopackage and reload the layer (delete the GeoJSON layer).

Extension:

Why Geopackage? Why not a shapefile?
Hint: http://www.geopackage.org

8. Install two useful QGIS plugins

To procees we need to install two QGIS plugins - Zonal statistics and Qgis2threejs:

  1. head to the 'plugins' menu and choose 'manage and install plugins'
  2. in the resulting dialogue box, search for 'zonal statistics'
  3. select the 'Zonal statistics plugin' and click 'install'.

Do the same for a plugin named 'Qgis2threejs'.

9. Analysis using zonal statistics

We will try to show:

  • Standard deviaton of elevation of blocks as a proxy for hilliness, plotted as a volume on the elevation map
  • Sum of photosynthetic vegetation present for each block, also as a volume on the map

We start with Zonal statistics. This collects data from raster layers using polygons in a vector layer and computes basic statistics for each polygon.

a. Section hilliness

As a proxy for section hilliness, we'll use the standard deviation of elevation in each block polygon.

Head to raster -> zonal stats to open the plugin.

In the zonal statistics plugin dialogue, choose the DEM as the raster layer, and use band 1. Then choose your ACT blocks vector layer. In the statistics to calculate, pick an appropriate set - but include standard deviation - this is our roughness proxy. Add a meaningful prefix to the statistics (e.g. 'dem_'), so you can find them when you need to use them.

QGIS will spend some time calculating stats for each block - grab a cup of tea!

b. Vegetation per block

For vegetation cover, each grid cell/pixel shows a percentage of photosynthetically active vegetation. As a quick measure we could sum all the pixel values inside a block to get an idea of it's vegetation coverage relative to other blocks.

In the zonal statistics plugin dialogue, choose the DEM as the raster layer, and use band 1. Then choose your ACT sections vector layer. In the statistics to calculate, the values we need are preselected - we'll use the mean value for each section. Again, use a meaningful prefix (for example, 'veg_')

c. Use vegetation per block to style our vector layer

We avoided styling our vector layer earlier, but now it's time - since we want to visualise the vegetation coverage in each section.

Double click your act_sections layer to open it's properties dialogue, and head to the style tab. Here, we want to apply a good looking colour scale based on mean vegetation cover (the data we just generated). Set up as follows:

...which results in the following map:

So far we can visualise the vegetation cover of sections in the ACT - but how can we relate that quickly and easily to section hillines? And how could we make a way to show the result in a convincing way?

Challenge question:

is a graphical GIS the only way to collect zonal statistics using a vector layer to segment raster data?

10. Using a simple 3D visualisation to complete the picture

So far we've imported three different datasets into QGIS and created some new attributes on our vector dataset. How

  • classify and colour blocks by vegetation cover
  • visualise block hilliness as the height of an extruded column

In this scheme, if hillier blocks are more vegetated, dark green blocks will visualise as taller columns. If hiller blocks are less vegetated, light green blocks will visualise as taller columns. Lets test it out!

Here we use the second plugin - Qgis2threejs. This renders the current screen to a WebGL map in a browser using three.js - with some neat data visualisation features.

a. Setting up - coordinate transformation in QGIS

So far we've worked in a WGS84 (EPSG:4326) coordinate system - but in order to render our map in three.js, we need to project our data into something which has units of metres, not degrees. Let's choose GDA94/MGA55 (EPSG:28355):

  1. Locate the panel at the lower right of the QGIS window which says 'EPSG:4326'
  2. Click it to open a CRS selection dialog
  3. Select 'Enable 'on the fly' CRS transformation (OTF)' at the top right
  4. Enter 'MGA 55' in the 'filter' box, then highlight GDA94 /MGA 55 in the 'coordinate systems of the world...' box. It should show up in the 'selected CRS' panel.
  5. Click OK.

You'll see that everything has warped a touch, and your CRS panel (lower right) reflects your choice. Proj.4 handles all the rest for you!

b. Set a clipping frame

Qgis2threejs attempts to render the whole map window as webGL. We want to limit or map to the extents of our DEM - so we can either zoom the map window in so that our region of interes occupies the whole window, or set a clipping polygon in a new vector layer. Here, we zoom in so that our region of interest fills the map window:

c. Setting up in Qgis2threejs

Head to web -> Qgis2threejs to open the plugin dialogue.

(some screebgrabs showing qgis2threejs setup)

d. show the map!


In [6]:
###ignore this block of code - it is required only to show the map in iPython - you won't need it!
from IPython.core.display import display, HTML
display(HTML('<iframe width="800" height="600" frameborder="1" scrolling="no" src="./qgis2threejs/veg_by_hilliness.html"></iframe>'))


11. So what do our results mean? Interpreting our picture

If hilly blocks have generally more vegetation than flatter blocks, short columns should be lighter shades of green (in this colour scheme).

...but that's not what we see! Why not?

...and are we using the right data?

Extension activities

  1. Which suburb is the greenest? which is the hilliest? * **hint** - use another QGIS plugin, and another web mapping data source*
  2. Choose a different dataset to analyse - using OWS or from the filesystem.
  3. Update your map to show satellite imagery from, for example, OpenStreetMap. Create a new model and see if you can visually confirm the results of the analysis.
  4. Create a totally new interactive 3D model using the stack we've just been working with, and give us a URL to see your work!
  5. Visualise the same result a different way - do we *really* need qgis2threejs? Could we do this whole process in Python?

12. Sharing our new map

The folder you just created using QGIS2threeJS can be dropped into any web server - to share with collaborators.

You can't serve data directly from the VDI - but you can copy your results to some web host (eg github.io). Here's an example:

https://adamsteer.github.io/nci_samples/qgis2threejs/veg_by_hilliness.html

What are some other ways to share QGIS projects?

Challenge question possible solutions

Sane WCS coverage request names

One option is to use curl on the command line:

curl -o SRTM_dem_139_36.tiff 'http://dapds00.nci.org.au/thredds/wcs/ub8/au/FractCov/PV/FractCover.V3_0_1.2015.aust.005.PV.nc?service=WCS&version=1.0.0&Request=GetCoverage&Coverage=PV&bbox=149.0,-36,149.9,-35&format=GeoTIFF_Float'
curl -o 2015_treecover_139_36.tiff 'http://dapds00.nci.org.au/thredds/wcs/ub8/au/FractCov/PV/FractCover.V3_0_1.2015.aust.005.PV.nc?service=WCS&version=1.0.0&Request=GetCoverage&Coverage=PV&bbox=149.0,-36,149.9,-35&format=GeoTIFF_Float'

...are there others? What was yours?

Zonal statistics right in a notebook

You could also try rasterstats: https://github.com/perrygeo/python-rasterstats - any other suggestions?

Other ways to share your QGIS projects, and other 3D visualisers

For simple projects with CCBY4 data, the QGIS cloud is one way of sharing your results. Another is a Jupyter notebook hosted on github as a gist, a notebook, or as a github.io page. What was your approach?

Loading OWS data is also possible in hosted 3D visualisers, for example NationalMap. Here's a sample of some XXXX data from NCI visualised using the NationalMap interface: (url) . What else do you use?

Thanks! Discussion and suggestions welcome.


In [ ]: